Traffic manager and a method for a traffic manager

ABSTRACT

The present invention relates to a traffic manager (1) and a method for a traffic manager (1), the method comprising the step of reading a first data packet (D₁-D_(D)) comprised in a first queue (Q₁-Q_(Q)) based on a scheduling priority (SP), the scheduling priority (SP) being determined: at least partly on a configured priority of the first queue (Q₁-Q_(Q)); at least partly on a first meter value (MV₁-MV_(M)) of a first meter (M₁-M_(M)) associated with the first queue (Q₁-Q_(Q)); and at least partly on a second meter value (MV₁-MV_(M)) of a second meter (M₁-M_(M)) associated with a first scheduling node (N₁-N_(N)); the first scheduling node being a parent node of the first queue.

TECHNICAL FIELD

The present invention relates to a traffic manager comprising a scheduler, a number of queues and a number of scheduling nodes hierarchically arranged at one or more scheduling levels under a root node, each scheduling node being configured to serve a queue or a scheduling node of a lower scheduling level according to priorities.

BACKGROUND

In communication networks, services with a wide range of traffic characteristics and quality of service requirements may share the bandwidth of a physical or logical network interface. Examples of services with different requirements are voice, video, best effort and control messaging. A service can have a minimum rate, which is guaranteed in most cases. However, in some applications, e.g. broadband aggregation, the guaranteed minimum rate is oversubscribed, i.e. the minimum rate cannot be guaranteed at all times, causing reduced quality of service, e.g. increased latency for a service, in periods with exceptionally high bandwidth demands.

Assume a tree hierarchy where the root represents a 100 Mbps Ethernet interface; the children of the root are virtual local area networks (VLANs); and each VLAN has two queues representing leaves in the tree, where a queue stores packets belonging to a service. Further, assume that each VLAN has a maximum rate of 8 Mbps and that one of the two queues has a minimum bandwidth guarantee of 5 Mbps. If minimum rate oversubscription is not allowed, then the 100 Mbps Ethernet interface cannot support more than 20 VLANs, since the sum of minimum bandwidth guarantees across all VLANs must satisfy 20 * 5 Mbps <= 100 Mbps. On the other hand, if oversubscription is allowed, more than 20 VLANs can be supported.

Since not all users are active at the same time, oversubscription can be handled by statistical multiplexing, whereby the sum of allocated minimum rates in the scheduler may exceed the total available rate.

However, situations may exist where the sum of demanded minimum rates exceeds the sum of allocated minimum rates. Even if such situations only occur with a small probability, it is desirable to be able to control them.

US 2005/0249220 A1 to Olsen et al. describes a hierarchical traffic management system and method ensuring that each of multiple queues is guaranteed a minimum rate, that excess bandwidth is shared in accordance with predefined weights, that each queue does not exceed a specified maximum rate, and that the data link is maximally utilized within the maximum rate constraints.

In US 2005/0249220 A1 to Olsen et al., each queue or node has two sets of attributes: enqueue attributes and dequeue attributes. The enqueue attributes control how data packets enter a queue, and as such control the depth of the queue. The dequeue attributes control how data packets exit the queue, and as such control scheduling of the queue with respect to other queues. Further, Olsen et al. describe minimum rate propagation, which allows child nodes to be configured with a minimum rate even though the parent node does not have an equal or greater minimum rate. By the minimum rate propagation, the parent node has a conditional minimum rate guarantee, meaning that when traffic is present on the child node that has a minimum rate guarantee, the parent also has the minimum rate guarantee, to be used only for traffic coming from the child with the guarantee.

The minimum rate propagation disclosed by Olsen et al. provides efficiency in applications where oversubscription is common and where it is not possible or desirable to give each parent node its own guarantee, yet delivery of some guaranteed service for some child node services is required.

One drawback with the method and system disclosed by US 2005/0249220 A1 to Olsen et al. is that priorities are only propagated from a child node to a parent node and not further up the hierarchy. Thus it is not possible to accurately handle cases in which the sum of minimum rates of the child nodes is higher than the sum of minimum rates of the parent nodes. Olsen et al. therefore cannot handle aggregation services and consequently cannot control bandwidth allocation in case of minimum rate oversubscription. Another drawback is that the priority attribute is associated with a single user-defined bandwidth, whereby any traffic up to this bandwidth is regarded as priority traffic and is given priority over other queues of a lower priority, causing bandwidth to be distributed among the traffic in high priority queues in dependence on the scheduling algorithm used.

US 2007/0104210 A1 to Wu et al. describes dynamic management of buffers and scheduling of data transmission with minimum and maximum shaping of flows of data packets in a network device, so that all of the output bandwidth can be fairly and fully utilized according to set requirements.

For each queue, during minimum bandwidth guarantee shaping, the scheduler is selected based on round robin scheduling or strict priority scheduling, according to a separate minimum bandwidth strict priority register.

After satisfying minimum bandwidth guarantees, each queue enters a maximum bandwidth allowable region, where the scheduler uses either weighted deficit round robin (WDRR) or strict priority (SP) to pick a data packet from different quality of service (QoS) queues.

US 2007/0104210 A1 to Wu et al. likewise does not disclose a method or a system capable of handling aggregation services, and the disclosed method and system are therefore not capable of controlling bandwidth allocation in case of minimum rate oversubscription.

SUMMARY

It is an object of the present invention to overcome the problems with oversubscription of a minimum rate guarantee in a communication network. Specifically, an object of the present invention is to share bandwidth in a controlled manner when the sum of demanded minimum rates is larger than the sum of allocated minimum rates. In other words, it is an object of the present invention to provide means for flexible and predictable bandwidth allocation in cases of oversubscription of minimum guaranteed rates.

Another object of the present invention is to guarantee minimum rate at any scheduling level.

The objects are reached by a traffic manager and a method for a traffic manager comprising a scheduler, a number of queues and a number of scheduling nodes hierarchically arranged at one or more scheduling levels, each scheduling node being configured to serve a queue or a scheduling node of a lower scheduling level according to priorities, the method comprising the step of reading a first data packet comprised in a first queue based on a scheduling priority. The scheduling priority is determined at least partly on a configured priority of the first queue; at least partly on a first meter value of a first meter associated with the first queue; and at least partly on a second meter value of a second meter associated with a first scheduling node; the first scheduling node being a parent node of the first queue.

Embodiments of the present invention are defined in the dependent claims.

DETAILED DESCRIPTION OF DRAWINGS

Embodiments of the present invention will be described in more detail with reference to the following drawings, in which:

FIG. 1a schematically illustrates a block diagram of a traffic manager according to an embodiment of the present invention;

FIG. 1b schematically illustrates a logical view of a traffic manager according to an embodiment of the present invention;

FIG. 2 schematically illustrates an example of a scheduling hierarchy;

FIG. 3 schematically illustrates scheduling of queues at an A scheduling node;

FIG. 4 schematically illustrates scheduling of scheduling nodes at B, C, or P level scheduling nodes, without propagated priorities;

FIG. 5 schematically illustrates scheduling of scheduling nodes at B, C, or P level scheduling nodes, with propagated priorities;

FIG. 6 schematically illustrates priority propagation in a scheduling hierarchy;

FIG. 7 schematically illustrates a network processor comprising an embodiment of the inventive traffic manager; and

FIG. 8 schematically illustrates a router/switch comprising one or more traffic managers.

DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION

The present invention will now be described in more detail with reference to the drawings, in which drawings same reference numerals indicate the same or corresponding features, components or means.

FIG. 1a schematically illustrates a block diagram of a traffic manager 1 according to an embodiment of the present invention. FIG. 1b schematically illustrates a logical view of a traffic manager (TM) 1 according to an embodiment of the present invention. The traffic manager 1 may be arranged to provide buffering, queuing and scheduling of data packets in a network system. The traffic manager 1 may be comprised in a network processor 10 of the network system, cf. FIG. 7. However, it may also be arranged as a stand-alone device, which may be arranged external of and in communication with a network processor.

The traffic manager may be arranged for multiple purposes in a network system. For example, the traffic manager may be configured for ingress traffic management, egress traffic management, and virtual output queuing. However, it should be understood that the traffic manager may be configured for other purposes in a network system.

In ingress and egress traffic management, the traffic manager algorithms may be configured to ensure that the bandwidth is shared according to service level agreements. The service level agreements describe e.g. quality of service parameters such as minimum bandwidth, maximum bandwidth, latency, jitter and loss probability for traffic received from or transmitted to a network user.

In virtual output queuing, the traffic manager queues and schedules data packets at an input of a switch fabric depending on quality of service parameters assigned to data packets in the queues and the availability of bandwidth and data storage in the switch fabric or at the outputs of the switch fabric.

The network processor can be any of a variety of known types, including the processor described in the international patent application no. PCT/EP2007/055777, which is incorporated herein by reference. The processor may comprise processing means of a variety of known types, including an asynchronous processing pipeline, as described in the international patent application no. PCT/SE2005/0019696, which is incorporated herein by reference.

The traffic manager 1 is arranged to receive data packets via one or several input ports 2 and to make a decision on what action to take based on packet information. The traffic manager 1 may therefore comprise means for packet inspection 3. The data packet information may be comprised in the data packet, e.g. in a protocol header. Alternatively or in addition, the packet information may be transferred as side-band information with the data packet, e.g. by attributes set by the entity transferring data packets to the traffic manager.

Further, the traffic manager may be arranged to select a queue to which the data packet is to be written. The selection of queue may be based on the packet information. For example, the queue number may be comprised in the packet information, and by reading the packet information the traffic manager knows the queue to which the data packet is to be written.

The traffic manager may also be arranged to decide whether to drop a data packet by drop means 4 or to enqueue a data packet by enqueue means 5, based on the packet information, based on queue information, e.g. the current queue length or average queue length, and/or based on parameters stored in the traffic manager and associated with the queue, such as drop thresholds and parameters for active queue management, e.g. weighted random early discard.

If a data packet is written to a queue Q, it becomes available for scheduling and dequeuing after any preceding data packet in the queue has been dequeued. The queue may be served by a scheduler 6, which in turn may be served by other schedulers at higher levels in a scheduling hierarchy. A scheduler 6 is arranged to use a scheduling algorithm to determine the service order of queues or scheduler nodes. A dequeue means 7 may be arranged to dequeue a data packet. Non-limiting examples of scheduling algorithms comprise first-come-first-serve, time-division multiplexing, round-robin, weighted round-robin, strict priority queuing, deficit round-robin, deficit weighted round-robin, weighted fair queuing and earliest deadline first.
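
To illustrate one of the algorithms named above, the following minimal sketch shows a deficit weighted round-robin round in Python. It is an illustration only, not part of the described traffic manager; the queue representation and the per-queue quantum values are assumptions.

    from collections import deque

    def dwrr_round(fifos, quanta, deficits):
        # Serve one deficit weighted round-robin round over all queues;
        # returns the packets sent as (queue index, packet size) pairs.
        sent = []
        for i, fifo in enumerate(fifos):
            if not fifo:
                deficits[i] = 0          # an idle queue saves no credit
                continue
            deficits[i] += quanta[i]     # grant this queue its quantum
            while fifo and fifo[0] <= deficits[i]:
                size = fifo.popleft()    # send the head-of-line packet
                deficits[i] -= size
                sent.append((i, size))
        return sent

    # Hypothetical usage: two queues with 2:1 weights, packet sizes in bytes.
    fifos = [deque([300, 300, 300]), deque([300, 300, 300])]
    print(dwrr_round(fifos, quanta=[600, 300], deficits=[0, 0]))
    # -> [(0, 300), (0, 300), (1, 300)]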

Dequeuing from a queue or from a scheduler node may be temporarily blocked by traffic shapers, such as leaky buckets or token buckets, or by backpressure signals received by the traffic manager. The backpressure signals may be received from a rate-limited output port whose local storage for data pending transmission is almost full.

On dequeuing, a data packet is transmitted through the output port 8 of the traffic manager. Optionally, the traffic manager is arranged to post-process the data packet before transmission. For example, the traffic manager may be arranged to edit the data packet header or to write information to side-band attributes.

In FIG. 1b, data packets D₁-D_(D) enter the traffic manager 1 through a data interface comprising one or more input ports 2, and are stored in one or more input buffers Q₁-Q_(Q), such as one or more queues, before being scheduled in a manner described below. After being scheduled, the data packets D₁-D_(D) are read from the queues and exit the traffic manager through one or more output ports 8.

Further, the traffic manager 1 comprises a number of scheduling nodes N₁-N_(N) hierarchically arranged at one or more scheduling levels L₁-L_(L), each scheduling node N₁-N_(N) being configured to serve a queue Q₁-Q_(Q) or a scheduling node N₁-N_(N) of a lower scheduling level L₁-L_(L) according to priorities.

The traffic manager 1 comprises a scheduler 6, e.g. a hierarchical scheduler, that may comprise a number of, e.g. four, identical schedulers, each scheduler having a single scheduling hierarchy and being allocated to a set of traffic interfaces matching its data bandwidth and packet rate. The lowest leaves of a scheduler hierarchy are the queues Q. Data packets from the queues are successively aggregated by scheduling nodes N arranged in a number of, e.g. five, levels. Each scheduler can be flexibly configured hierarchically, featuring 8192 queues and more than 2048 internal scheduling nodes which are shared between the traffic interfaces allocated to the scheduler. FIG. 2 schematically illustrates an example of a scheduling hierarchy.

As illustrated in FIG. 2, data packets D are stored in 8192 FIFO (first-in-first-out) queues Q, and four queues are mapped to one out of 2048 level A scheduling nodes N_(A). 1-1024 level A scheduling nodes N_(A) are mapped to one out of 512 level B scheduling nodes N_(B), and 1-512 level B scheduling nodes N_(B) are mapped to one out of 128 level C scheduling nodes N_(C). 1-128 level C scheduling nodes N_(C) are mapped to one out of 16 ports P, and 1-16 ports P are mapped to the scheduler tree T.

As an example, an egress scheduler at a user side of an oversubscribed Metro Ethernet system may assign 12 ports to Gigabit Ethernet interfaces. Within each port, level B and level C correspond to logical interfaces and services, level A corresponds to users, and the queues correspond to applications. It should be understood that the number of levels may vary. Further, if a shallower hierarchy is needed, level B or level C may for example be configured as transparent dummy layers connecting to upper or lower level nodes one-to-one.

Each of the scheduling nodes N_(A), N_(B), and N_(C), and each of the ports P may comprise a strict priority scheduler, a round robin (RR) scheduler or a deficit weighted round robin (DWRR) scheduler. The scheduler tree T may comprise a strict priority scheduler or a round robin (RR) scheduler.

Further, as illustrated in FIGS. 1b and 2, each of the queues Q, scheduling nodes N, and ports P is associated with dual rate shapers, e.g. dual token bucket shapers. The rate shapers are bit rate shapers.

In this description, one of the dual rate shapers, a minimum rate shaper also called a minimum (committed) token bucket shaper, is referred to as a meter M, while the other, a maximum rate shaper also called a maximum (excess) token bucket shaper, is referred to as a shaper S.

The meter M is configured to define a dynamic priority, whereby the priority of the associated queue, node or port can be dynamically changed between a high priority level and a low priority level in dependence on the relation between a meter value MV and a meter limit value MLV of the meter M. As illustrated in FIG. 1b, the first meter M₁ has a meter value MV₁ that is less than the meter limit value MLV₁, meaning that the priority is low. However, if the meter value MV₁ were equal to or larger than the meter limit value MLV₁, the priority would be high.
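
Expressed as a minimal sketch (illustrative Python; the function and parameter names are assumptions, not taken from the specification), the dynamic priority decision is a single comparison:

    def dynamic_priority(meter_value, meter_limit_value):
        # High priority while the meter value is at or above its limit value.
        return "high" if meter_value >= meter_limit_value else "low"

    # With a meter limit value of zero, a non-negative meter value means
    # that committed (minimum rate) tokens remain.
    assert dynamic_priority(meter_value=5, meter_limit_value=0) == "high"
    assert dynamic_priority(meter_value=-3, meter_limit_value=0) == "low"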

The shaper S is configured to limit the bit rate to the output port 8 of the traffic manager 1.

A first data packet D₁-D_(D) comprised in a first queue Q₁-Q_(Q) is read from the first queue and for example admitted to the output port 8 of the traffic manager 1 based on a scheduling priority SP. The scheduling priority SP is determined at least partly on a configured priority CP₁-CP_(Q) of the first queue Q₁-Q_(Q), cf. FIG. 3; at least partly on a first meter value MV₁-MV_(M) of a first meter M₁-M_(M) associated with the first queue Q₁-Q_(Q); and at least partly on a second meter value MV₁-MV_(M) of a second meter M₁-M_(M) associated with the first scheduling node N₁-N_(N).

In embodiments, the scheduling priority may further be determined at least partly on a configured priority CP₁-CP_(N) of a first scheduling node N₁-N_(N); the first scheduling node N₁-N_(N) being a parent node of the first queue Q₁-Q_(Q).

The first and second meters M₁-M_(M) may be so-called rate shapers and may be provided in any suitable form, for example as a software program, or part thereof, or as digital or analogue circuits of electrical, optical or mechanical components. The meters may use loose or strict token bucket algorithms, so that admittance of data is based on a value of a credit parameter. However, any other suitable admittance algorithm may be used.

The meters M₁-M_(M) are configured to dynamically change their priority level between a high priority level, when the meter value MV₁-MV_(M) is equal to or higher than a meter limit value MLV, and a low priority level, when the meter value MV₁-MV_(M) is lower than the meter limit value MLV.

If the meter limit values MLV₁ and MLV₂ of the first and second meters M₁-M_(M), respectively, are set to zero, the first and second meters will have a high priority if their meter value is equal to or greater than zero, and they will have a low priority if their meter value is less than zero.

The first and second meter values MV₁-MV_(M) of the first and second meters M₁-M_(M) are decreased by an amount corresponding to the amount of bits of the first data packet D₁-D_(D), if the first data packet is read from the queue and admitted to the output port 8 of the traffic manager 1. Optionally, the meter values may in addition be decreased by a shaping offset to compensate in advance for future changes in packet size, e.g. by adding or removing packet headers. The shaping offset may be positive, zero, or negative. The shaping offset may be configured per queue or per node, or be passed to the traffic manager with the packet as an attribute.

Further, the meter values MV₁-MV_(M) of the first and second meters M₁-M_(M) are periodically increased, e.g. every clock cycle of the processor, by a meter value amount. The meter value amount may be set during configuration and may be given as amount/interval [bits/s]. In one embodiment, if the thus increased meter values exceed configurable burst size parameters BS₁-BS_(M), the meter values are set to BS₁-BS_(M).
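
The meter behaviour described in the last three paragraphs can be collected into a small token bucket sketch (illustrative only; the class and attribute names are assumptions). The shapers S₁-S_(M) described further below follow the same update pattern.

    class Meter:
        # Token bucket meter sketch: drained on dequeue, refilled periodically.
        def __init__(self, rate_per_tick, burst_size, limit_value=0):
            self.value = burst_size          # MV: current token count, in bits
            self.rate = rate_per_tick        # refill amount per clock cycle
            self.burst_size = burst_size     # BS: cap on the meter value
            self.limit_value = limit_value   # MLV

        def on_dequeue(self, packet_bits, shaping_offset=0):
            # Drain by the packet size, optionally adjusted by a shaping
            # offset compensating for headers added or removed later.
            self.value -= packet_bits + shaping_offset

        def on_tick(self):
            # Periodic refill, saturating at the configured burst size.
            self.value = min(self.value + self.rate, self.burst_size)

        def high_priority(self):
            return self.value >= self.limit_value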

It should be understood that in embodiments of the present invention, the first and second meter values MV₁-MV_(M) of the first and second meters M₁-M_(M) may be increased by a suitable amount, e.g. by an amount corresponding to the amount of bits of the first data packet D₁-D_(D), if the first data packet is read from the queue and admitted to the output port 8 of the traffic manager 1.

In such embodiments, the priority level of the meter may be high when the meter value is less than or equal to a meter limit value, and the priority level may be low when the meter value is higher than the meter limit value.

Further, it should be understood that in embodiments wherein the first and second meter values MV₁-MV_(M) are increased if the first data packet is read from the queue and admitted to the output port 8 of the traffic manager 1, the meter values MV₁-MV_(M) of the first and second meters M₁-M_(M) are periodically decreased, e.g. every clock cycle of a processor, by a meter value amount.

In embodiments, the reading of the first data packet D₁-D_(D) and further admittance of the first data packet D₁-D_(D) to the output port of the traffic manager may also be based:

- at least partly on a first shaper value SV₁-SV_(M) of a first shaper S₁-S_(M) associated with the first queue Q₁-Q_(Q); and
- at least partly on a second shaper value SV₁-SV_(M) of a second shaper S₁-S_(M) associated with the first scheduling node N₁-N_(N).

The shapers may be provided in any suitable form, for example as a software program, or part thereof, or as digital or analogue circuits of electrical, optical or mechanical components. The shapers use loose or strict token bucket algorithms, so that admittance of data is based on a value of a credit parameter. However, any other suitable admittance algorithm may be used.

A first and a second shaper value SV₁-SV_(M) of the first and the second shapers S₁-S_(M), associated with the first queue Q₁-Q_(Q) and the first scheduling node N₁-N_(N), respectively, are decreased by a value corresponding to the amount of bits of the first data packet D₁-D_(D), if the first data packet D₁-D_(D) is read from the first queue Q₁-Q_(Q).

Further, the first and second shaper values SV₁-SV_(M) of the first and second shapers S₁-S_(M) are periodically increased, e.g. every clock cycle of the processor, by a shaper value amount. The shaper value amount may be set during configuration and may be given as amount/interval [bits/s]. In one embodiment, if the thus increased shaper values exceed configurable burst size parameters BS₁-BS_(M), the shaper values are set to BS₁-BS_(M).

It should be understood that in embodiments of the present invention, the first and second shaper values SV₁-SV_(M) of the first and second shapers S₁-S_(M) may be increased by a suitable amount, e.g. by an amount corresponding to the amount of bits of the first data packet D₁-D_(D), if the first data packet is read from the queue and admitted to the output port 8 of the traffic manager 1.

Further, it should be understood that in embodiments wherein the first and second shaper values SV₁-SV_(M) are increased if the first data packet is read from the queue and admitted to the output port 8 of the traffic manager 1, the shaper values SV₁-SV_(M) of the first and second shapers S₁-S_(M) are periodically decreased, e.g. every clock cycle of a processor, by a shaper value amount.

In embodiments, the reading of the first data packet D₁-D_(D) and further admittance of the first data packet D₁-D_(D) to the output port of the traffic manager may also be based:

- at least partly on a propagated priority PP corresponding to a scheduling priority SP propagated from the first queue Q₁-Q_(Q) at a lower scheduling level L₁-L_(L) to the first scheduling node N₁-N_(N) at a higher scheduling level L₁-L_(L).

Further, the reading of the first data packet D₁-D_(D) and further admittance of the first data packet D₁-D_(D) to the output port of the traffic manager may also be based:

- at least partly on a propagated priority PP corresponding to a scheduling priority SP propagated from the first scheduling node N₁-N_(N) at a lower scheduling level L₁-L_(L) to a parent scheduling node N₁-N_(N) at a higher scheduling level L₁-L_(L).

The configured priority CP of the first queue Q₁-Q_(Q) may be strict priority, e.g. high priority, medium priority or low priority; or dynamic priority. FIG. 3 shows queues configured to have high priority, medium priority and dynamic priority.

In embodiments, the scheduling priority SP at a queue level is 3, 2, 1, 0 or DC, wherein 3 is the highest scheduling priority and DC stands for "do not care", i.e. the scheduling priority has no effect on the scheduling decision.

The configured priority CP of the first scheduling node N₁-N_(N) may be strict priority (SPQ) or normal priority.

In embodiments, the scheduling priority SP at a node level is 5, 4, 3, 2, 1, 0 or DC, 5 being the highest scheduling priority.

However, it should be understood that the present invention is not limited to the priorities given, e.g. the configured priorities and scheduling priorities; these priorities are only to be considered as examples.

FIG. 3 schematically illustrates scheduling of queues at an A scheduling node N_(A). As illustrated in FIG. 3, the queues Q₀-Q₇ have the configured priorities high, medium, medium, dynamic, dynamic, dynamic, dynamic and dynamic, respectively. Every queue except Q₃ comprises a data packet D. Further, each of the queues has a meter M having a meter value MV and a shaper S having a shaper value SV. The queues Q₀-Q₄ and Q₆ have meter values less than the meter limit values (shown as empty meter buckets in the figures), indicating that the queues Q₀-Q₄ and Q₆ have low priority. The queues Q₅ and Q₇ have meter values MV₅ and MV₇ larger than the respective meter limit values (shown as filled buckets in the figures), indicating that the queues Q₅ and Q₇ have a high priority.

Further, each of the queues Q₀-Q₂ and Q₄-Q₆ has a shaper S having a shaper value SV equal to or larger than the shaper limit value (shown as filled shaper buckets in the figures). The queues Q₃ and Q₇ have shapers S₃ and S₇, respectively, having shaper values SV₃ and SV₇, respectively, less than the respective shaper limit value (shown as empty shaper buckets in the figures).

As illustrated, at an A scheduling node, the queues Q₀-Q₂ having a strict configured priority, e.g. a high or medium configured priority, are scheduled in accordance with their configured priority as long as the queue contains a data packet and has shaping tokens, i.e. a shaping value SV larger than the shaping limit value.

Further, for a queue having a dynamic configured priority, both the meter M and the shaper S are used for scheduling. All dynamically configured queues comprising a data packet and having a meter value MV larger than the meter limit value MLV are scheduled before dynamically configured queues having a meter value MV less than the meter limit value. If two dynamically configured queues have equal meter values, the queue having the highest shaper value may be scheduled before the other queue.

As illustrated in FIG. 3, the A scheduling node N_(A) may comprise four round robin (RR) or weighted fair queuing (WFQ) schedulers and a strict priority (SPQ) scheduler. The results from the four RR or WFQ schedulers are the queues Q₀, Q₂, Q₅ and Q₄, associated with the four strict scheduling priority values 3, 2, 1 and 0, respectively, 3 being the highest scheduling priority and 0 the lowest scheduling priority. After a strict priority scheduling, queue Q₀ will be selected.
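
The two-stage decision at the A node can be sketched as follows (illustrative Python; the queue records and the round-robin bookkeeping are simplified assumptions, and the shaper-value tie break between dynamic queues with equal meter values is omitted):

    def schedule_a_node(queues, last_served):
        # Stage 1: sort schedulable queues into strict priority levels 3..0.
        levels = {3: [], 2: [], 1: [], 0: []}
        for i, q in enumerate(queues):
            if not q["has_data"] or not q["shaper_ok"]:
                continue                  # nothing to send, or no shaping tokens
            if q["priority"] == "high":
                levels[3].append(i)
            elif q["priority"] == "medium":
                levels[2].append(i)
            else:                         # dynamic: the meter decides the level
                levels[1 if q["meter_ok"] else 0].append(i)
        # Stage 2: round robin within a level, strict priority across levels.
        for level in (3, 2, 1, 0):
            candidates = levels[level]
            if candidates:
                later = [i for i in candidates if i > last_served[level]]
                pick = (later or candidates)[0]
                last_served[level] = pick
                return pick
        return None

    # FIG. 3 state: Q0 high, Q1/Q2 medium, Q3-Q7 dynamic, Q3 empty, meters of
    # Q5/Q7 conforming, shapers of Q3/Q7 non-conforming: Q0 is selected.
    queues = [
        dict(has_data=True,  priority="high",    meter_ok=False, shaper_ok=True),
        dict(has_data=True,  priority="medium",  meter_ok=False, shaper_ok=True),
        dict(has_data=True,  priority="medium",  meter_ok=False, shaper_ok=True),
        dict(has_data=False, priority="dynamic", meter_ok=False, shaper_ok=False),
        dict(has_data=True,  priority="dynamic", meter_ok=False, shaper_ok=True),
        dict(has_data=True,  priority="dynamic", meter_ok=True,  shaper_ok=True),
        dict(has_data=True,  priority="dynamic", meter_ok=False, shaper_ok=True),
        dict(has_data=True,  priority="dynamic", meter_ok=True,  shaper_ok=False),
    ]
    assert schedule_a_node(queues, last_served={3: -1, 2: -1, 1: -1, 0: -1}) == 0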

FIG. 4 schematically illustrates scheduling of scheduling nodes at B, C, or P level scheduling nodes, without propagated priorities. As illustrated, the nodes N₀ and N₂ have the configured priority strict priority (SPQ) and will therefore be mapped to the SPQ scheduler of the node N_(B), N_(C) or port P and scheduled with the highest priority.

The nodes N₁ and N₃-N₇ have the configured priority normal, and will therefore be scheduled in dependence on the meter values MV₁, MV₃-MV₇ of the meters M₁, M₃-M₇, respectively.

The nodes N₄, N₅ and N₇ have meter values MV₄, MV₅ and MV₇ larger than the meter limit values MLV₄, MLV₅ and MLV₇, and will therefore be mapped to the meter scheduler of the node N_(B), N_(C) or port P, the meter scheduler being indicated by Min in FIG. 4.

The nodes N₁ and N₆ have meter values less than the meter limit values MLV₁ and MLV₆, respectively, and will therefore be scheduled in dependence on their shaper values. Therefore the nodes N₁ and N₆ are mapped to the shaper scheduler of node N_(B), N_(C) or port P, the shaper scheduler being indicated by Max in FIG. 4. By means of RR scheduling or WFQ scheduling, the nodes N₂, N₅ and N₁ will be mapped to the strict scheduling priority levels 2, 1 and 0, respectively. Thereafter, a strict priority scheduling will schedule node N₂ as the highest priority node.

FIG. 5 schematically illustrates scheduling of scheduling nodes at B, C, or P level scheduling nodes, with propagated priorities. As illustrated in the figure, the nodes N₀-N₇ have the same configured priorities as the nodes shown in FIG. 4. Further, the meter values of the meters of the nodes correspond to the meter values of the meters of the nodes in FIG. 4. The same is true for the shaper values of the shapers. However, in FIG. 5, priorities are propagated from a child node to a parent node. In the shown example, the parent nodes N₀-N₇ have the propagated priorities 0, 2, 1, 0, 3, 3, 2 and 0, respectively.

As illustrated, the nodes N₀ and N₂, configured with a strict priority SPQ, are mapped to an SPQ scheduler of the parent node independently of the propagated priority. The nodes N₄ and N₅, configured with a normal priority, having meter values MV₄ and MV₅ larger than the meter limit values MLV₄ and MLV₅, respectively, and the highest propagated priority, 3, are mapped to the highest meter scheduler, indicated as Min3 in FIG. 5.

No node configured with normal priority has both a meter value larger than its meter limit value and a propagated priority of 2 or 1, and therefore no nodes are mapped to the next highest meter scheduler, indicated as Min2, or to the next-next highest meter scheduler, indicated as Min1, in FIG. 5.

Only one node, node N₇, has a normal configured priority, a meter value MV₇ larger than the meter limit value MLV₇, and a propagated priority of 0, and is therefore mapped to the lowest meter scheduler, indicated as Min0 in FIG. 5.

Two nodes, nodes N₁ and N₆, are configured with normal priorities, but have meter values MV₁ and MV₆ less than the meter limit values MLV₁ and MLV₆, respectively, and will therefore not be mapped to one of the meter schedulers Min3, Min2, Min1, or Min0. Instead, the nodes N₁ and N₆ will be mapped to a shaper scheduler, indicated as Max in FIG. 5, since the shaper values SV₁ and SV₆ are larger than the shaper limit values SVL₁ and SVL₆, respectively.

By means of RR scheduling or WFQ scheduling, node N₂ is selected over node N₀ since N₂ has a higher propagated priority, i.e. 1 instead of 0. Since node N₂ is configured with a strict priority, node N₂ is given the highest strict scheduling priority value, i.e. 5.

By means of RR scheduling or WFQ scheduling, node N₅ is selected over node N₄ and mapped to the next highest strict scheduling priority value, i.e. 4. Further, since node N₇ is the only node mapped to the lowest meter scheduler Min0, it is given the next lowest priority value, i.e. 1.

By means of RR scheduling or WFQ scheduling, node N₁ is selected over node N₆ and given the lowest scheduling priority value, i.e. 0. Finally, by means of strict priority scheduling, node N₂ is scheduled as the highest priority node and its scheduling priority may be propagated to a higher level.
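
A condensed sketch of the mapping stage just walked through (illustrative Python following the narrative of FIG. 5; the node records are hypothetical and N₃, which is not mentioned as schedulable, is omitted):

    def scheduler_input(node):
        # Map a node to its scheduler input at the parent node (FIG. 5).
        if node["strict"]:
            return "SPQ"                  # independent of propagated priority
        if node["meter_ok"]:
            return "Min%d" % node["pp"]   # Min3 (highest) .. Min0 (lowest)
        if node["shaper_ok"]:
            return "Max"
        return None                       # not valid for scheduling

    nodes = {
        "N0": dict(strict=True,  meter_ok=False, shaper_ok=True, pp=0),
        "N1": dict(strict=False, meter_ok=False, shaper_ok=True, pp=2),
        "N2": dict(strict=True,  meter_ok=False, shaper_ok=True, pp=1),
        "N4": dict(strict=False, meter_ok=True,  shaper_ok=True, pp=3),
        "N5": dict(strict=False, meter_ok=True,  shaper_ok=True, pp=3),
        "N6": dict(strict=False, meter_ok=False, shaper_ok=True, pp=2),
        "N7": dict(strict=False, meter_ok=True,  shaper_ok=True, pp=0),
    }
    for name, node in nodes.items():
        print(name, "->", scheduler_input(node))
    # N0/N2 -> SPQ, N4/N5 -> Min3, N7 -> Min0, N1/N6 -> Max

Within each scheduler input, RR or WFQ then picks one node, and the picks are ranked by strict scheduling priority values exactly as described above; Table 2 below states the resulting priorities in general form.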

FIG. 6 schematically illustrates an example of a single queue high priority propagation in a scheduling hierarchy. The scheduling of queues, nodes, and ports may be accomplished as previously described, and the determined scheduling priority may be propagated in the hierarchy. Propagated priority is a way to prefer nodes that have higher priority queues and conforming meter values, e.g. conforming minimum (committed) token buckets, all the way through the hierarchy, over nodes that have only lower priority queues, or that have high priority queues but lack conforming minimum (committed) token buckets all the way through the hierarchy.

In the shown scheduling hierarchy, four propagated priority levels exist, each of which behaves according to the same principles. As illustrated, priority is propagated from the queues to the A nodes, from the A nodes to the B nodes, from the B nodes to the C nodes, and from the C nodes to the ports. However, it should be understood that the priority propagation can be limited per scheduler; e.g. the priority propagation can be limited to the queue level and the A node level, but not allowed above the B node level. Propagated priorities are used by the node scheduling logic, so that the node with the highest priority queue is selected before a node with a lower priority queue.

In cases with dynamic priorities, both on a queue level and on a node level, the propagated priorities are visible to the scheduling logic as long as the meter value is higher than a limit value, i.e. as long as minimum tokens exist. When the meter value is less than a limit value, i.e. when the minimum tokens are depleted, the node priority is changed to the lowest priority.

Strict priority can be enabled for any node in the scheduling hierarchy. Nodes with strict priority are served with the highest priority, ignoring their propagated priority. However, they still propagate their propagated priority (from lower levels) in the same manner as nodes configured as having a normal priority.

Scheduling priorities for queues are shown in Table 1. The scheduling priorities are propagated to upper levels with the propagated priority mechanism. Locally they are used for scheduling a queue. The label Min TB represents a meter value: if it is indicated as "yes", the meter value is larger than the meter limit value, i.e. high priority; if it is indicated as "no", the meter value is less than the meter limit value, i.e. low priority. The label Max TB represents a shaper value, and "yes"/"no" indicates that the shaper value is larger/less than the shaper limit value.

TABLE 1

    Data     Queue priority   Min TB   Max TB   Scheduling Priority
    Yes***   Hi               Yes*     DC       3
    Yes      Medium           Yes      DC       2
    Yes      Dynamic          Yes      DC       1
    Yes      Dynamic          No       Yes**    0
    No       DC               DC       DC       DC****
    Yes      Hi               No       DC       DC
    Yes      Medium           No       DC       DC
    Yes      Dynamic          No       No       DC

*Min Token Bucket is on conforming status
**Max Token Bucket is on conforming status
***Queue has data to transmit
****The queue is not valid for scheduling
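
Read as a lookup, Table 1 may be sketched as the following function (illustrative Python; the parameter names are assumptions):

    def queue_scheduling_priority(has_data, priority, min_tb, max_tb):
        # Scheduling priority of a queue per Table 1; None represents DC,
        # i.e. the queue does not affect the scheduling decision.
        if not has_data:
            return None
        if min_tb:                        # minimum token bucket conforming
            return {"hi": 3, "medium": 2, "dynamic": 1}[priority]
        if priority == "dynamic" and max_tb:
            return 0                      # dynamic queue on excess tokens only
        return None                       # no minimum tokens: DC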

Scheduling priorities for nodes are shown in Table 2. These priorities are calculated based on the configured priority and on the token bucket state, e.g. based on the meter value of the associated meter and the shaper value of the associated shaper. The scheduling priorities are propagated to upper levels with the propagated priority mechanism. Locally the scheduling priorities are used for scheduling a node.

TABLE 2

    Data     Strict priority   Min TB   Propagated Priority   Max TB   Scheduling Priority
    Yes***   Yes               Yes*     DC                    DC       5
    Yes      No                Yes      3                     DC       4
    Yes      No                Yes      2                     DC       3
    Yes      No                Yes      1                     DC       2
    Yes      No                Yes      0                     DC       1
    Yes      No                No       DC                    Yes**    0
    No       DC                DC       DC                    DC       DC****
    Yes      Yes               No       DC                    DC       DC
    Yes      No                No       DC                    No       DC

*Min Token Bucket is on conforming status
**Max Token Bucket is on conforming status
***Node has queues with data under it and tokens all the way up the tree
****The node is not valid for scheduling
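
Table 2 admits the same treatment (illustrative Python; the parameter names are assumptions):

    def node_scheduling_priority(has_data, strict, min_tb, propagated, max_tb):
        # Scheduling priority of a node per Table 2; None represents DC.
        if not has_data:
            return None
        if strict:
            return 5 if min_tb else None  # strict still requires min tokens
        if min_tb:
            return 1 + propagated         # propagated 3..0 map to 4..1
        if max_tb:
            return 0                      # excess tokens only
        return None

Table 3 for ports follows the same pattern with a reduced priority range.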

Scheduling priorities for ports are shown in Table 3. These priorities are calculated based on the configured priority and on the token bucket state, e.g. the meter value of the associated meter and the shaper value of the associated shaper. The scheduling priorities are used for scheduling a port.

TABLE 3

    Data     Strict Priority   Min TB   Max TB   Scheduling Priority
    Yes***   Yes               Yes*     DC       2
    Yes      No                Yes      Yes**    1
    Yes      No                No       Yes      0
    No       DC                DC       DC       DC****
    Yes      Yes               No       DC       DC
    Yes      No                No       No       DC

*Min Token Bucket is on conforming status
**Max Token Bucket is on conforming status
***Port has queues with data under it and tokens all the way up the tree
****The port is not valid for scheduling

FIG. 7 schematically illustrates a network processor 10 comprising an embodiment of a traffic manager 1. As illustrated, the network processor 10 comprises a processing means 18 and a data packet interface 12 arranged to receive and/or transmit data packets. For example, the interface 12 may comprise 100 Mbps Ethernet MACs, Gigabit Ethernet MACs, 10-Gigabit Ethernet MACs, PCIe interfaces, SPI-4.2 interfaces, Interlaken interfaces, etc.

The interface 12 is arranged in communication with the traffic manager 1 and with a packet buffer 14, e.g. a shared memory switch. The shared memory switch 14 is configured to temporarily store data packets leaving one of the network subsystems, e.g. the interface 12, the traffic manager 1 or the processing means 18. Thus, the shared memory switch 14 may be arranged to interconnect network subsystems.

The processing means 18 may be a dataflow pipeline of one or more processors with special-purpose engines. The processing means 18 is configured to classify and edit data packet information, e.g. headers, to perform functionalities such as switching, forwarding and firewalling.

Further, an optional external memory 20 for one or more databases, comprising e.g. packet forwarding tables, may be arranged in communication with the network processor 10.

In embodiments of the network processor 10, a data packet is received via the interface 12 of the network processor 10. The interface 12 writes the packet to the shared memory switch 14, which buffers the packet. The shared memory switch 14 writes the packet to the processing means 18, which processes the packet and sets packet information for use by the traffic manager 1. The processing means 18 writes the packet to the traffic manager 1, optionally via the shared memory switch 14. Further, the traffic manager 1 is configured to decide what action is to be taken based on e.g. the packet information prepared by the processing means 18. If the packet is enqueued in the traffic manager and later dequeued, the traffic manager writes the packet either to the interface 12, optionally via the shared memory switch 14, or back to the processing means 18, optionally via the shared memory switch 14, for post-processing. From the processing means the packet is written back to the interface 12, optionally via the shared memory switch 14. Finally, the packet is transmitted from the network processor 10 via the interface 12.

FIG. 8 schematically illustrates a router or a switch 30 comprising one or more traffic managers 1, such as one or more ingress traffic managers 1 and one or more egress traffic managers 1. The router/switch 30 may also comprise several network processors 10. A traffic manager 1 may be comprised in a network processor 10. However, it should be understood that a network processor 10 may be configured without a traffic manager 1.

As illustrated in FIG. 8, the router/switch 30 may comprise one or more input queues to a switch fabric 32, e.g. a bus, comprised in the router/switch 30, and one or more output queues 34 from the switch fabric 32. The input queues may be virtual output queues 8′ of a traffic manager.

It should be understood that embodiments of the present invention are configured to guarantee minimum rate at any scheduling level. For example, a scheduling hierarchy may have a group of child nodes at a scheduling level that are associated with a parent node at the next level. The nodes have a minimum rate relating to their meter values and a maximum rate relating to their shaper values. Assume that the bandwidth offered to the parent node is greater than or equal to the sum of the minimum rates of the child nodes but less than the sum of the maximum rates of the child nodes. In such situations, even if all priorities propagated to the child nodes are zero, the child nodes which do not exceed their minimum rates are scheduled with a higher priority than the child nodes exceeding their minimum rates, consequently guaranteeing minimum rate at any level.

Minimum rate guarantee at any scheduling level is also illustrated by the following example:

Assume that a node at level A corresponds to a network user and that a queue corresponds to an application. Also assume that each user has at least one "best effort" application queue, e.g. for web traffic, which has a maximum rate but not a minimum rate. Further, assume that each user has a maximum rate and a minimum rate. By configuring the user minimum rates such that their sum does not exceed the available bandwidth allocated to the level B node aggregating the users, each user is guaranteed to get her minimum rate.
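
A minimal sketch of the configuration rule in this example (illustrative Python; the rate values and names are assumptions):

    def min_rates_guaranteed(user_min_rates_bps, level_b_bandwidth_bps):
        # True if the minimum rates are not oversubscribed at the
        # aggregating level B node, so every user gets her minimum rate.
        return sum(user_min_rates_bps) <= level_b_bandwidth_bps

    # Twenty users at 5 Mbps each under a 100 Mbps node: exactly feasible,
    # matching the VLAN example in the background section.
    assert min_rates_guaranteed([5_000_000] * 20, 100_000_000)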

Although the present invention has been described in accordance with the embodiments shown, one of ordinary skill in the art will readily recognize that there could be variations made without departing from the scope of the invention. For example, as previously described, it should be understood that the meter value and/or the shaper value may be increased instead of decreased when a data packet is read from a queue. Accordingly, it is intended that all matter contained in the above description and shown in the accompanying drawings shall be interpreted as illustrative and not in a limiting sense.

CLAIMS

1. A method for a traffic manager (1) comprising a hierarchical scheduler (6), a number of queues (Q₁-Q_(Q)) and a number of scheduling nodes (N₁-N_(N)) hierarchically arranged at one or more scheduling levels (L₁-L_(L)), each scheduling node (N₁-N_(N)) being configured to serve a queue (Q₁-Q_(Q)) or a scheduling node (N₁-N_(N)) of a lower scheduling level (L₁-L_(L)) according to priorities, the method comprising the step of reading a first data packet (D₁-D_(D)) comprised in a first queue (Q₁-Q_(Q)) based on a scheduling priority (SP), the scheduling priority (SP) being determined: at least partly on a configured priority (CP₁-CP_(Q)) of the first queue (Q₁-Q_(Q)); at least partly on a first meter value (MV₁-MV_(M)) of a first meter (M₁-M_(M)) associated with the first queue (Q₁-Q_(Q)); and at least partly on a second meter value (MV₁-MV_(M)) of a second meter (M₁-M_(M)) associated with a first scheduling node (N₁-N_(N)); the first scheduling node being a parent node of the first queue.

2. A method according to claim 1, wherein the scheduling priority (SP) is further determined at least partly on a configured priority (CP₁-CP_(N)) of the first scheduling node (N₁-N_(N)).

3. A method according to claim 1, wherein the first and second meters (M₁-M_(M)) are configured to dynamically change their priority level between a high priority level, when the first and second meter value (MV₁-MV_(M)), respectively, is equal to or higher than a first and second meter limit value (MLV₁-MLV_(M)), respectively, and a low priority level, when the first and second meter value (MV₁-MV_(M)), respectively, is lower than the first and second meter limit value (MLV₁-MLV_(M)).

4. A method according to claim 1, further comprising the steps of: decreasing the first and second meter values (MV₁-MV_(M)) of the first and second meters (M₁-M_(M)), respectively, by a value corresponding to the amount of bits of the first data packet (D₁-D_(D)), if the first data packet (D₁-D_(D)) is read from the first queue (Q₁-Q_(Q)); and periodically increasing the first and second meter values (MV₁-MV_(M)) of the first and second meters (M₁-M_(M)), respectively, by a meter value amount.

5. A method according to claim 1, wherein the first and second meters (M₁-M_(M)) are configured to dynamically change their priority level between a low priority level, when the first and second meter value (MV₁-MV_(M)), respectively, is higher than a first and second meter limit value (MLV₁-MLV_(M)), respectively, and a high priority level, when the first and second meter value (MV₁-MV_(M)), respectively, is equal to or less than the first and second meter limit value (MLV₁-MLV_(M)).

6. A method according to claim 5, further comprising the steps of: increasing the first and second meter values (MV₁-MV_(M)) of the first and second meters (M₁-M_(M)), respectively, by a value corresponding to the amount of bits of the first data packet (D₁-D_(D)), if the first data packet (D₁-D_(D)) is read from the first queue (Q₁-Q_(Q)); and periodically decreasing the first and second meter values (MV₁-MV_(M)) of the first and second meters (M₁-M_(M)), respectively, by a meter value amount.

7. A method according to claim 1, further comprising the step of reading the first data packet (D₁-D_(D)) based: at least partly on a first shaper value (SV₁-SV_(M)) of a first shaper (S₁-S_(M)) associated with the first queue (Q₁-Q_(Q)); and at least partly on a second shaper value (SV₁-SV_(M)) of a second shaper (S₁-S_(M)) associated with the first scheduling node (N₁-N_(N)).

8. A method according to claim 7, further comprising the steps of: decreasing the first and second shaper values (SV₁-SV_(M)) of the first and second shapers (S₁-S_(M)) associated with the first queue (Q₁-Q_(Q)) and the first scheduling node (N₁-N_(N)), respectively, by a value corresponding to the amount of bits of the first data packet (D₁-D_(D)), if the first data packet (D₁-D_(D)) is read from the first queue (Q₁-Q_(Q)); and periodically increasing the first and second shaper values (SV₁-SV_(M)) of the first and second shapers (S₁-S_(M)) by a shaper value amount.

9. A method according to claim 7, further comprising the steps of: increasing the first and second shaper values (SV₁-SV_(M)) of the first and second shapers (S₁-S_(M)) associated with the first queue (Q₁-Q_(Q)) and the first scheduling node (N₁-N_(N)), respectively, by a value corresponding to the amount of bits of the first data packet (D₁-D_(D)), if the first data packet (D₁-D_(D)) is read from the first queue (Q₁-Q_(Q)); and periodically decreasing the first and second shaper values (SV₁-SV_(M)) of the first and second shapers (S₁-S_(M)) by a shaper value amount.

10. A method according to claim 1, further comprising the step of reading the first data packet (D₁-D_(D)) based: at least partly on a propagated priority (PP) corresponding to a scheduling priority (SP) propagated from the first queue (Q₁-Q_(Q)) at a lower scheduling level (L₁-L_(L)) to the first scheduling node (N₁-N_(N)) at a higher scheduling level (L₁-L_(L)).

11. A method according to claim 1, further comprising the step of reading the first data packet (D₁-D_(D)) based: at least partly on a propagated priority (PP) corresponding to a scheduling priority (SP) propagated from the first scheduling node (N₁-N_(N)) at a lower scheduling level (L₁-L_(L)) to a parent scheduling node (N₁-N_(N)) at a higher scheduling level (L₁-L_(L)).

12. A method according to claim 1, wherein the configured priority (CP) of the first queue (Q₁-Q_(Q)) is strict priority, e.g. high priority, medium priority or low priority; or dynamic priority.

13. A method according to claim 1, wherein the configured priority (CP) of the first scheduling node (N₁-N_(N)) is strict priority or normal priority.

14. A method according to claim 1, wherein the scheduling priority (SP) at a queue level is 3, 2, 1, 0, or DC; 3 being the highest scheduling priority.

15. A method according to claim 1, wherein the scheduling priority (SP) at a node level is 5, 4, 3, 2, 1, 0, or DC; 5 being the highest scheduling priority.
16. A traffic manager (1) comprising a hierarchical scheduler (6), a number of queues (Q₁-Q_(Q)) and a number of scheduling nodes (N₁-N_(N)) hierarchically arranged at one or more scheduling levels (L₁-L_(L)), each scheduling node (N₁-N_(N)) being configured to serve a queue (Q₁-Q_(Q)) or a scheduling node (N₁-N_(N)) of a lower scheduling level (L₁-L_(L)) according to priorities, the traffic manager (1) being adapted to read a first data packet (D₁-D_(D)) comprised in a first queue (Q₁-Q_(Q)) based on a scheduling priority (SP), the scheduling priority (SP) being determined: at least partly on a configured priority (CP) of the first queue (Q₁-Q_(Q)); at least partly on a first meter value (MV₁-MV_(M)) of a first meter (M₁-M_(M)) associated with the first queue (Q₁-Q_(Q)); and at least partly on a second meter value (MV₁-MV_(M)) of a second meter (M₁-M_(M)) associated with a first scheduling node (N₁-N_(N)); the first scheduling node being a parent node of the first queue.

17. A traffic manager according to claim 16, further being adapted to determine the scheduling priority (SP) at least partly on a configured priority (CP) of the first scheduling node (N₁-N_(N)).

18. A traffic manager (1) according to claim 16, wherein the first and second meters (M₁-M_(M)) are configured to dynamically change their priority level between a high priority level, when the first and second meter value (MV₁-MV_(M)), respectively, is equal to or higher than a first and second meter limit value (MLV₁-MLV_(M)), respectively, and a low priority level, when the first and second meter value (MV₁-MV_(M)), respectively, is lower than the first and second meter limit value (MLV₁-MLV_(M)), respectively.

19. A traffic manager according to claim 18, further being adapted to: decrease the first and second meter values (MV₁-MV_(M)) of the first and second meters (M₁-M_(M)), respectively, by a value corresponding to the amount of bits of the first data packet (D₁-D_(D)), if the first data packet (D₁-D_(D)) is read from the first queue (Q₁-Q_(Q)); and periodically increase the first and second meter values (MV₁-MV_(M)) of the first and second meters (M₁-M_(M)), respectively, by a meter value amount.

20. A traffic manager (1) according to claim 16, wherein the first and second meters (M₁-M_(M)) are configured to dynamically change their priority level between a low priority level, when the first and second meter value (MV₁-MV_(M)), respectively, is higher than a first and second meter limit value (MLV₁-MLV_(M)), respectively, and a high priority level, when the first and second meter value (MV₁-MV_(M)), respectively, is equal to or less than the first and second meter limit value (MLV₁-MLV_(M)), respectively.

21. A traffic manager according to claim 20, further being adapted to: increase the first and second meter values (MV₁-MV_(M)) of the first and second meters (M₁-M_(M)), respectively, by a value corresponding to the amount of bits of the first data packet (D₁-D_(D)), if the first data packet (D₁-D_(D)) is read from the first queue (Q₁-Q_(Q)); and periodically decrease the first and second meter values (MV₁-MV_(M)) of the first and second meters (M₁-M_(M)), respectively, by a meter value amount.

22. A traffic manager according to claim 16, further being adapted to read the first data packet (D₁-D_(D)) based: at least partly on a first shaper value (SV₁-SV_(M)) of a first shaper (S₁-S_(M)) associated with the first queue (Q₁-Q_(Q)); and at least partly on a second shaper value (SV₁-SV_(M)) of a second shaper (S₁-S_(M)) associated with the first scheduling node (N₁-N_(N)).

23. A traffic manager according to claim 22, further being adapted to: decrease the first and second shaper values (SV₁-SV_(M)) of the first and second shapers (S₁-S_(M)) associated with the first queue (Q₁-Q_(Q)) and the first scheduling node (N₁-N_(N)), respectively, by a value corresponding to the amount of bits of the first data packet (D₁-D_(D)), if the first data packet (D₁-D_(D)) is read from the first queue (Q₁-Q_(Q)); and periodically increase the first and second shaper values (SV₁-SV_(M)) of the first and second shapers (S₁-S_(M)) by a shaper value amount.

24. A traffic manager according to claim 22, further being adapted to: increase the first and second shaper values (SV₁-SV_(M)) of the first and second shapers (S₁-S_(M)) associated with the first queue (Q₁-Q_(Q)) and the first scheduling node (N₁-N_(N)), respectively, by a value corresponding to the amount of bits of the first data packet (D₁-D_(D)), if the first data packet (D₁-D_(D)) is read from the first queue (Q₁-Q_(Q)); and periodically decrease the first and second shaper values (SV₁-SV_(M)) of the first and second shapers (S₁-S_(M)) by a shaper value amount.

25. A traffic manager according to claim 16, further being adapted to read the first data packet (D₁-D_(D)) based: at least partly on a propagated priority (PP) corresponding to a scheduling priority (SP) propagated from the first queue (Q₁-Q_(Q)) at a lower scheduling level (L₁-L_(L)) to the first scheduling node (N₁-N_(N)) at a higher scheduling level (L₁-L_(L)).

26. A traffic manager according to claim 16, further being adapted to read the first data packet (D₁-D_(D)) based: at least partly on a propagated priority (PP) corresponding to a scheduling priority (SP) propagated from the first scheduling node (N₁-N_(N)) at a lower scheduling level (L₁-L_(L)) to a parent scheduling node (N₁-N_(N)) at a higher scheduling level (L₁-L_(L)).

27. A traffic manager according to claim 16, wherein the configured priority (CP) of the first queue (Q₁-Q_(Q)) is strict priority, e.g. high priority, medium priority or low priority; or dynamic priority.

28. A traffic manager according to claim 16, wherein the configured priority (CP) of the first scheduling node (N₁-N_(N)) is strict priority or normal priority.

29. A traffic manager according to claim 16, wherein the scheduling priority (SP) at a queue level is 3, 2, 1, 0, or DC; 3 being the highest scheduling priority.

30. A traffic manager according to claim 16, wherein the scheduling priority (SP) at a node level is 5, 4, 3, 2, 1, 0, or DC; 5 being the highest scheduling priority.

31. A router, switch or computer unit comprising a traffic manager (1) according to claim 16.